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Abstract 

In this paper, we extend the vertical modeling approach for the analysis of 
survival data with competing risks to incorporate a cured fraction in the population, 
that is, a proportion of the population for which none of the competing events can 
occur. The proposed method has three components: the proportion of cure, the 
risk of failure, irrespective of the cause, and the relative risk of a certain cause of 
failure, given a failure occurred. Covariates may affect each of these components. 
An appealing aspect of the method is that it is a natural extension to competing 
risks of the semi-parametric mixture cure model in ordinary survival analysis; thus, 
causes of failure are assigned only if a failure occurs. This contrasts with the existing 
mixture cure model for competing risks of Larson and Dinse, which conditions at the 
onset on the future status presumably attained. Regression parameter estimates are 
obtained using an EM-algorithm. The performance of the estimators is evaluated 
in a simulation study. The method is illustrated using a melanoma cancer data set. 


1 Introduction 

In medicine, the risk of failnre from a given disease or the chance of getting cured of it are 
of interest to the diseased patients as well as to the treating physicians; this helps patients 
at risk to make important life decisions and it helps physicians in treatment selection. 
However, investigating this type of information in a clinical study involving time-to-event 
data requires accounting for multiple possible outcomes: either failure, due to the disease 
or due to some other cause, or cure. From the statistical point of view, the analysis 
of several types of failures is incorporated in the competing risks framework. These 
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methods usually assume that all patients will eventually experience one of the possible 
types of failure if there is sufficient follow-up and, therefore, do not accomodate cure. The 
presence of cure is suggested when the data includes a considerable number of long-term 
event-free survivors (censored patients with long follow-up times). Therefore, extending 
modeling of competing risks to accommodate a cured proportion is an important issue in 
understanding this type of data. 

In the analysis of single outcome survival data with a cured fraction, cure models 
address the problem of cure rate estimation, as well as the estimation of the probability of 
failure due to the disease of interest. They have received a lot of attention both in terms of 


Yu et al. 

(2004) 

Kim et al. 

(2013 

) and applications 

Andersson et al.|i 

2011 ; 

Andrae et al. 


(2012). These cure models are mixture models which specify a conditional model for the 


survival component, given that failure may occur, and a marginal distribution of the 
binary indicator for whether or not cure can occur. They are formulated either in a 


parametric or in a semi-parametric way (Kuk and Chen (1992); Taylor (1995); Sy and 


Taylor (2000); Peng and Dear (2000); Peng (2003); Corbiere et al. (2009)). They share 
the common feature of allowing the cure rate to be determined from the onset, but to 
be observed only later in the course of the follow-up, leading to the presence of a sub- 
popnlation of event-free snrvivors beyond sufficiently long follow-np. 

Analysis of competing risks data with a cure fraction is not well developed. Sev¬ 


eral authors caste the competing risks model of Larson and Dinse (1985) into the cure 


framework. These models express the mixing joint distribntion of the time-to-event and 
type of event variables as the sum of the marginal cause-specihc distribntions mnltiplied 
by the associated mixing proportions. This approach is formulated under the strong, 
unverihable assumption that the mixing proportions for failure types and cnre indicator 
is determined bnt unobserved at the onset. In terms of formulas, this amounts to the 
following decomposition of the joint distribution of time of failure T and failure type D: 


( 1 ) 


P{T,D) = P{T\D) ■ P{D). 


Examples include the approach of Chao (1998), which imputes the cure indicator for 


censored patients and uses a Gibbs sampling algorithm for estimation; Ng and McLachlan 


(1998) propose a parametric version, to be used when there are only few failures from the 


competing causes and Choi and Zhou (2002) discuss extensively a class of multivariate 


parametric models. This approach could be adopted in contexts where the interest is in 
assessing the parameters of the conditional failnre time distribntion given failure type or 
in the mixing distribntion of the different competing failure types. 

However, from the inference and interpretation points of view, the case where the 
model parameters translate directly into natural observable quantities in competing risks 
is appealing. In this paper, we adopt this perspective and introduce a semi-parametric 
approach for the analysis of competing risks data with a cure fraction. This approach 


extends the idea of vertical modeling formulated earlier by Nicolaie et al. (2010) in the 


competing risks framework, and it is based on the following decomposition of the joint 
distribntion of time of failnre T and failure type D\ 


( 2 ) 


P(T,D) = P{D\T)-P{T). 
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In the remainder of the article, we demonstrate how this can be applied to the analysis 
of mixture cure data with several competing causes of failure in the presence of right- 
censored data. In Section we introduce in detail our approach. Simulation studies are 
presented in Section In Section we illustrate our methods of analysis on a clinical 
study on melanoma cancer. Section concludes with some points for discussion. 


2 Vertical modeling with a cured fraction 

2.1 Notation 

Suppose that data are available from n individuals each of whom can experience one 
of J terminal, competing events during the period under study or can be subjected to 
noninformative right censoring. Assume that a non-negligible proportion of individuals 
does not experience any of the terminal events by the end of the follow-up and assume 
further that follow-up is long enough to be able to consider individuals with full follow¬ 
up as cured from the disease of interest on the basis of some clinical evidence. Such 
population can naturally be regarded as a mixture population in which two categories of 
individuals are combined: susceptible (the individual experiences failure, irrespective of 
the cause) and non-susceptible or cured (the individual is immune to all causes of failure). 
Let T denote the time-to-event variable, C the right-censoring time variable, and D the 
terminal event type, where D G {!,... ,J}. Assign D = 1 for the cause of interest and 
D > 2 for the competing causes. Let Z denote an Z-vector of covariates measured at 
baseline. 

Consider a binary random variable Y such that Y = 1 corresponds to a susceptible 
individual and Y = 0 corresponds to a nonsusceptible/cured individual. Note that in 
this type of study Y is partially observed; it equals 1 in the case of an event, and it is 
unobserved in the case of right-censoring. If the latter occurs, the individual has no event 
observed during the study period, but either the event will eventually take place (the 
individual is censored and susceptible) or the event will never take place in the future 
(the individual is censored and non-susceptible). 

The observed data for an individual i is Oi = (Ti, Ai, Zi), where Ti = min(Tj,C'i) is 
the earliest of time-to-event and censoring time, and A* = l{Tj < Ci}Di is the type of 
terminal event in the case a terminal event occurs and 0 in the case of censoring, for 
i = 1,... ,n. Data from different individuals are assumed to be independent. Assume 
that iT,D) and C are independent given Z. 

2.2 Model formulation 

Our goal is to develop and implement an approach to determine whether a terminal 
event would occur (whether an individual is susceptible, which hereafter is referred to as 
incidence) and, conditional on being susceptible, when the event might occur and which 
type of event it might be, given a failure occurred (which hereafter are jointly referred to 
as latency). In addition, we are interested in assessing covariate effects on incidence and 
latency. 

The incidence part is completely specihed by the probability distribution P{Y). De¬ 
note P{Y = 1) = p] thus, 1 — p represents the proportion of individuals who get cured. 
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For the latency part, we aim to extend the vertical modeling approach, earlier proposed 
by Nicolaie et ah (2010), to the mixture cure model framework. We shall refer to this 


competing risks mixture cure model as vertical modeling with a cured fraction (VMCF). 

The main idea behind the modeling of the latency part of VMCF is to specify the 
conditional (on Y = 1) joint distribution P[T,D\Y = 1) as 


(3) 


P(T, D\Y = l)= P{T\Y = 1) ■ P{D\T, Y = 1), 


that is, the product of the conditional (on Y = 1) failure rate P{T\Y = 1) and of the 
conditional distribution of the causes of failure, given a failure occurred P{D\T,Y = 1). 
If we assume that the survival time T is continuous, we dehne the conditional (on Y = 1) 
total hazard by 


(4) 


A.(t|V = l) 


lim 

—>^0 


P{t<T <t + Af|T >t,Y = l) 
At 


and its cumulative counterpart by A,(f|V = 1) = A,(m|V = l)du. The former specihes 
the conditional (on Y = 1) failure distribution P{T\Y = 1). In turn, the conditional 
(on Y = 1) survival function of susceptible individuals, dehned as S'(f|y = 1) = P(T > 
t\Y = 1) is given by 


(5) 


^(t|V = l)=exp(-A.(f|V = l)). 


We emphasize that S{t\Y = 1) is a proper survival function in the sense that hmi_,.oo S{t\Y = 
1) = 0. Note that the conditional (on V = 0) survival function of non-susceptible is de¬ 
generate, that is, P{T > t\Y = 0) = 1. We dehne the conditional (on Y = 1) relative 
cause-specihc hazard of cause j at time t by 

(6) 7r,{t\Y = 1) = P{D =j\f = t,Y = l), j = 1,..., J. 

Note that the probability 7ij{t\T = t,Y = 1) deals with failure time and cause, therefore 
its estimation involves only the (observed) susceptible individuals. This implies that (|^ 
can be expressed as: 

(7) P(D=j\f = t,Y = l) = PiD=j\f = t), j = l,...,J. 


As a consequence, we suppress the dependence on V = 1 of 7ij{t\Y = 1) and, in the 
following, we will denote it simply by vrj(f). Thus the vector (vrj(f))j=i^,,.^j denotes the 
conditional (on T = t) distribution P{D\T = t), with T,j7rj{t) = 1 for all t. Thus the 
vector (A,(t|y = 1), {7ij(t))j=i^„_j) completely specihes the latency part of VMCF. 

The conditional (on Y = 1) cumulative incidence function of cause j, dehned as 
Fj{t\Y = 1) = P(T < t, D = j\Y = 1), can be obtained as 

(8) Fj{t\Y = 1) =[ X,{u\Y = 1)71 j{u)S{u-\Y = l)du, j = 

Jo 

The marginal survival function, which is dehned as Spop{t) = P{T >t) can be expressed 
as 


(9) Spopit) = P{Y = l)P{T>t\Y = 1) + P{Y = 0)P{T>t\Y = 0) 

= p ■ S{t\Y = 1) -1-1 — p. 

Note that Spopit) is an improper survival function in the sense that limi^oo Spop(t) = 1—p. 
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2.3 Specific models 

For the incidence p(X) = P{Y = 1|X), where X C Z, we postulate the usual binary 
regression models: 

( 10 ) g{p{X))=l3^y.\ 

where X* = (1,X), (3 stands for a vector of unknown regression parameters, including 
an intercept, and 5 ^ is a known differentiable link function. This includes the logistic link 
model g{p) = logj^, the complementary log-log link g{p) = log(—log(l — p)) and the 
probit link g{p) = <l>“^(p), where is the distribution function of a standard normal 
distribution. 

For the conditional (on Y = 1) total hazard we postulate a Cox proportional hazards 
model: 


(11) A.(t|y = 1 , Z) = \oit\Y = 1) exp ( 7 ^Z), 

where Ao(-|h^ = 1) is an unspecihed conditional (on Y = 1) baseline hazard and 7 stands 
for a vector of unknown regression parameters. Let to = 0 and denote by ti < ^2 < • • • < 
tx the K ordered event times. Assume that Ao(t|F = 1) is a step function such that 

\o{t\Y = 1) = \o{ti\Y = 1), all t e 


for I e {0,...,K -1}. 

For the relative hazards we specify 


( 12 ) 


7r,((|U) 


exp(nj B(f) + vJU) 
Ef=iexp(KlB(t)+vlU)’ 




where U C Z, B(t) is an r-vector of pre-specihed time functions and r/j = (Kj,Vj) stands 
for an m-vector of unknown regression parameters, j = 1,..., J. For identihability, we 
set rjj = 0. Examples of B(t) are polynomial or spline functions. Denote by 6 the vector 
(/3, 7 , ? 7 i,..., r]j) of all regression parameters characterizing the components of VMCF. 


2.4 Marginal model 


It is also interesting to look at the relationship between the marginal (population) total 
hazard A,,pop(t) and the conditional (on Y = 1) total hazard \,(t\Y = 1). First, note 
that the conditional expectation of Y given T >t is determined by 
(13) 

EIYIT > *] = P(K = 1|T > 1) = £‘1^ = 1)- P ■ S(*|r = 1) 


P(T > t) 

The marginal total hazard is given by 

' .(«) pS((|K = l)A.((|y = l) 


(14) A., 


pop 


— S 

/ _ ^pOp\ 

^ ’ Spopit) 


pS{t\Y = 1) -I-1 — p 


pS{t\Y = 1) + 1 — p' 


= E[Y\T>t]X.it\Y = l), 


where S' stands for the derivative of S. Intuitively, the above relationship between the 
two hazards expresses the fact that A,,pop(t) is the average of the Y\,(t\Y = 1) taken 
over all the individuals at risk just before T = t. 
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(15) 


Note that 0 implies that the marginal total hazard is given by 


A.,pop(t|Z, X) = Xo{t\Y = 1) exp (7 ' Z) — 


g-\(3^:^*)So{t\Y = l) 


_ 1 \exp(7T“Z) 


g-^{l3^X*)So{t\Y = + 1 - g-\l3^X.*) ’ 


where S'o(t|l^ = 1) = exp(—Ao(t|F = 1)). This clearly shows that at the population level, 
the proportional hazards assumption postulated in the strata {Y = 1} no longer holds. 
Alternatively, we can write the model in the following way: 


(16) 


Vp„p(i|Z) = Ao(()exp(y(i)Z) , 


where Ao(t) is an unspecihed baseline hazard and ^{1) is an unknown, time-dependent 
vector of regression coefficients. For two individuals with covariate vectors Z and Z, the 
following relationship holds for the k-th component of the vector r]{t): 

(17) g^mZk - Zk) = ^k{Zk - Z,) + log , 

f;[f|t > f,z,x] 

which expresses that the log hazard ratio at the population level equals the sum of the 
log hazard ratio in the strata {Y = 1} and of a time-varying term. Derivation of this 
formula is given in Appendix A. 


2.5 Likelihood 

Omitting covariates for a moment, the observed likelihood is the product of contributions 
of individuals from two categories: an individual i who fails at time ti due to cause j 
contributes 

P{fi = u, A = j, L = 1) = P{Yi = 1) ■ P(T = u\Y, = 1) ■ P(A = m = U) ■ P(A > U), 

and an individual i who is censored at time ti contributes 

P(f > U) = [P{Y, = 1) ■ P(f > U\Y, = 1) + P{Yi = 0)] ■ P{Ci = ti). 

If we assume that the distributions of T and C have no common parameters, then 
we can omit the contribution of C to the likelihood; therefore, the observed likelihood is 
proportional to 

L = f[[p,P(T = A|y, = = 

i=l j=l 

n 

■ J] [p,P{f > U\Y, = 1) + (1 

i=l 

In terms of relative and (conditional) total hazards, the observed likelihood can be written 
as 

A = n = l)exp [-A,{ti\Yi = 1 )] JJ 

i=l j=l 

n 

■ JJ {piexp [-A,{ti\Yi = 1)] + (1 

i=l 
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which can be separated in the following way: 


(18) L{e, Xo{t\Y = 1)) = Li(/3, 7 , Xo{t\Y = 1); F) ■ 

where 


Li(/ 3 , 7 ,Ao(t|F = l);F) = J] = 1) exp [-A.(t,|F, = 1)] 

i=l 

n 

■ JJ {piexp [-JX,{ti\Yi = 1)] + (1 

i=l 




2.6 Estimation 


We will use a variant of the EM algorithm given in Sy and Taylor (2000) to estimate the 


parameters (0, Ao(ti|F = 1),..., Ao(ti<'|F = 1)) of the VMCF. The technique consists of 
an adaptation of the EM algorithm to deal with the latent Y and to accommodate our 
semi-parametric approach. Typically, it is assumed that all event-free survivors after tx 
get cured. This amounts to imposing the zero-tail constraint, that is, S{t\Y = 1) = 0 for 
t > tx, a. condition which assures that the observed likelihood is well-behaved (see also 


Sy and Taylor (2000)) 


Denote by C the complete data, that is, the sample data when the latent Y would be 
observed for all cases. The complete likelihood function is given by: 


Lc{0, Ao(t|F = 1)) = L,if3, 7, Ao(t|F = iy,Y) ■ ■ ■ ■ ,Vj), 


where 


L3(/3,7,Ao(t|F = l);F) = J] {p,A.(f,|F, = 1) exp [-A.(f,|F, = 1)] 

i=l 

■ n {{ftexp |-A.(i,|r. = . (1 -py(n=0) 


Note that Di > 0 implies = 1, and Fj = 0 implies Di = 0. After rearranging the 
factors, we get: 


L3(/3.7.V(*|F = l)iF) 


( 19 ) 


i=l 

n 

l[X.{U\Y, = l)i^^*>o>exp [-F,A.(f,|F, = 1)] 

i=l 

L3i(/3;F)-L32(7,Ao(t|F = l);F). 
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The maximization of logLc(0, Ao(t|h' = 1)) is enhanced by the observation that the 
multinomial logistic structure embedded in logL 2 (? 7 i,..., rjj) can be maximized indepen¬ 
dently of logL 3 (/ 3 , 7 , Xo{t\Y = 1); Y). An appealing fact is that this can be achieved by 
means of standard software like the PROC GLM in SAS or the glm function in R ([^ 
Development Core Team| (2010)). Denote by {r]i,... ,r]j) the estimator of {rji,... ,r]j). 


Maximization of log L^{f3, 7 , Xo{t\Y = 1); R) is more involved and uses the EM-algorithm 
following the approach in Sy and Taylor (2000). Technical details are given in Appendix 
B. 


2.7 Standard errors 


An approximation of the asymptotic variance of {6,Xo{ti\Y = 1),..., Ao(t_fc|R = 1)) 
can be obtained as the inverse of the observed full information matrix A 0 ,Ao(t|y=i) of 
L{9, Xo{t\Y = 1)). Note that due to the factorisation (18), we can write: 


'^0,Ao(iT=i) - 


/ ^^,7,Ao(t|y=i) 


1 0 

%) 


where is the observed full information matrix of Li(/3, 7 , Ao(t|R = 1);R) 

and is the observed full information matrix of ^ 2 (^ 1 , • • •, iIj)- 

) derived a formula for A/ 3 ^-^_A.( |y=i)- In Appendix C, we use 
their results to obtain the standard errors of the conditional (on Y = 1) cause-specihc 
cumulative hazards from vertical modeling with a cure fraction. 


Sy and Taylor (2000 


3 Simulations 

In this simulation study, we assess the performance of our estimators in the semi-parametric 
setting when the assumed model is correct. We simulate data for n = 500 individuals, 
each of whom can fail due to the disease (cause 1 ) or due to a competing cause (cause 
2 ), or can be subjected to right-censoring. Assume a non-negligible proportion of indi¬ 
viduals get cured. Individuals are followed over a period of at most 15 years; random 
right-censoring, which is independent of survival time, occurred uniformly in the interval 
[7,15] years. Assume Z ~ 77(0,1) is a continuous, normally distributed baseline co¬ 
variate. We generate data as random samples drawn from a VMCF model (the “true” 
model), where we assume that the cure indicator Y follows a logistic regression model; 

logit(P(F = l|Z))=/?o + A-^, 

for (/9 o,/3i) £ {(—0.62,1.24), (—1.38,0), (1.38,0)} leading to three scenarios with various 
amounts of cure proportions, that is, 65% for Z = 1, 80% and 20% respectively. The 
latency part is specihed by 

A.(f|F = l,Z) = Ao(t|r = l)exp(7Z), 

where 7 = 0.3 and the baseline hazard Ao(t|R = 1) is constant, equal to 0.4. The relative 
hazards are constant such that = 0.25 and vr 2 (t) = 0.75, favoring cause 2 over cause 

1 . 
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VMCF and VM are fitted to the simulated data. The incidence component of VMCF 
and the conditional (on Y = 1) total hazard are assumed to have the same form as in 
the true model. The relative hazards are assumed constant over time, with no covariate 
effect; the observed proportions of cause specihc events relative to the total number of 
events were employed as their estimates. The covariate Z was therefore included in the 
two mixture components (incidence and latency). The probabilities Fj{t\Yi = 1, Z^) and 
FjltjZi), j = 1,2, are reported at each of the prediction time points {1, 2, 5, 7,11} for an 
individual i with Zj = 1. In VM, a Cox proportional hazards model with Z as predictor 
with unspecihed baseline hazard is postulated for the total hazard. The relative hazards 
are identical with those in VMCF. The probabilities Fj{t\Zi), j = 1,2, are reported at 
each of the prediction time points {1, 2, 5, 7,11} for an individual i with Zi = 1. 

We reported in Tablesandthe estimated bias and root mean squared error (RMSE) 
obtained for our estimators under various proportions of susceptibles. Each scenario was 
run 10000 times. 


Table 1: Estimated bias (root mean squared error) Fj{t\Y = 1, Z = 1) with respect to the 
true cumulative incidence function Fj{t\Y = 1, Z = 1), j = 1, 2, for the three scenarios. 




65% cure 

in {Z — 1} 


80% 

cure 



20% 

cure 


t 

mY = 

1,Z = 1) 

F2{t\Y = 

1,Z = 1) 

Y{t\Y = 

1,Z = 1) 

F2{t\Y = 

l,Z= 1) 

FMY = 

1,Z = 1) 

F2(t\Y = 

1,Z = 1) 

1 

- 0.0001 

(0.0162) 

-0.0005 

(0.0314) 

-0.0003 

(0.0239) 

-0.0008 

(0.0500) 

-0.0004 

(0.0119) 

-0.0012 

(0.0247) 

2 

0.0001 

(0.0229) 

0.0003 

(0.0350) 

0.0000 

(0.0328) 

0.0002 

(0.0559) 

-0.0002 

(0.0164) 

-0.0006 

(0.0276) 

5 

0.0012 

(0.0302) 

0.0036 

(0.0337) 

0.0018 

(0.0419) 

0.0057 

(0.0495) 

0.0004 

(0.0208) 

0.0015 

(0.0240) 

7 

0.0017 

(0.0313) 

0.0052 

(0.0335) 

0.0030 

(0.0435) 

0.0092 

(0.0472) 

0.0008 

(0.0215) 

0.0027 

(0.0228) 

10 

0.0029 

(0.0321) 

0.0088 

(0.0339) 

0.0049 

(0.0447) 

0.0151 

(0.0476) 

0.0013 

(0.0219) 

0.0041 

(0.0225) 


Table 2: Estimated bias (root mean squared error) of Fj{t\Z = 1) with respect to the 
true cumulative incidence function Fj{t\Z = 1), j = 1, 2, in VM and VMCF respectively, 
when /3o = —0.62, /3i = 1.24 and 7 = 0.3. 




VM 



VMCF 


t 

'Fi{t\Z = l) 

F2(t|Z = 1) 

Fi(t|Z = 1) 

F2(t|Z = 1) 

1 

-0.0064 

(0.0119) 

-0.0193 

(0.0284) 

0.0000 

(0.0112) 

0.0002 

(0.0229) 

2 

-0.0086 

(0.0170) 

-0.0260 

(0.0366) 

0.0004 

(0.0160) 

0.0012 

(0.0281) 

5 

-0.0084 

(0.0217) 

-0.0254 

(0.0393) 

0.0012 

(0.0214) 

0.0037 

(0.0322) 

7 

-0.0071 

(0.0221) 

-0.0214 

(0.0376) 

0.0016 

(0.0222) 

0.0049 

(0.0330) 

10 

-0.0053 

(0.0222) 

-0.0159 

(0.0353) 

0.0024 

(0.0228) 

0.0073 

(0.0338) 


The proposed method of estimation VMCF shows little bias in estimating both 
Fj{t\Y = 1, Z = 1) and Fj{t\Z = 1), with bias (RMSE) increasing with the increase 
in the cure proportion. The bias contributes little to the RMSE. The behaviour at later 
time points is more uncertain due to the smaller number of events by the end of the follow 
up. Comparisons of Fj{t\Z = 1) derived from VMCF and VM with the true cause-specihc 
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cumulative incidences consistently show smaller bias (RMSE) for VMCF, thus supporting 
the idea that htting VMCF to data from a population with a non-neglijable proportion 
of cure can replace a conventional survival model when the interest is to estimate the 
disease frequency. 

The coverage rates for the normal approximations 95% conhdence intervals of Fj{t\Y = 
1, Z = 1) and Fj{t\Z = 1) are higher for VMCF compared to VM. They are reported in 
Table for the hrst scenario. The other two scenarios, with heavy or small amount of 
cure, show the same pattern (not reported). 


Table 3: Estimated coverage probabilities of Fj{t\Y = 1, Z = 1) and Fj{t\Z = 1), j = 1,2 
in VM and VMCF respectively, when /Iq = —0.62, /li = 1.24 and 7 = 0.3. 




VMCF 



VM 


VMCF 

t 

Fi(dT = 1,Z 

= 1) m\Y 

= 1,Z=1) 

FMZ = 

1) 'F2{t\Z = l) 

Fi{t\Z = 

1) F2{t\Z = l) 

1 

95.2 


95.4 

90.8 

85.1 

95.2 

95.5 

2 

95.1 


95.1 

91.6 

82.9 

95.0 

95.0 

5 

95.1 


95.1 

93.3 

86.8 

95.1 

95.2 

7 

95.2 


95.0 

94.2 

89.9 

95.1 

94.8 

10 

95.1 


94.4 

94.8 

92.5 

95.1 

94.6 


The performance of estimators /3o, /^i and 7 was studied in Sy and Taylor (2000). 
Since these parameters are separated from the parameters of the relative hazards model, 
the results obtained in Sy and Taylor (2000) are applicable to and were conhrmed in our 
setting and consequently have not been explored further. 


4 Data analysis 


Our data comprises 205 patients with malignant melanoma who had their tumor removed 
by surgery between 1962 and 1977. These data from Andersen et al. (1993) are part of 


the boot package (Canty and Ripley ( 2008[ )) for the R software. This was a prospective 
clinical study to assess the effect of risk factors on survival. Baseline covariates are 
gender, age (in years) at operation, year of operation, tumor thickness and ulceration of 
the tumor tissue. Age (scaled by 10 ), year of operation (centered at 1970 and scaled by 
10) and thickness were considered continuous covariates; age ranged from 4 to 95 years 
and thickness ranged from 0.10 to 17.42 mm. 61% of patients were men and 44% of 
patients presented with ulceration of the tissue at the time of surgery. The follow-up 
time ranged from 10 days to 15.23 years. The survival time is known only for those 
patients who have had their event before the end of 1977. The rest of the patients are 
censored at the end of 1977. Endpoint of interest is the time from operation to death due 
to melanoma (cause 1 ), in the presence of a competing cause, that is, death due to other 
causes, unrelated to melanoma (cause 2). The total number of deaths was 71: 57 (80%) 
deaths due to cause 1 and 14 (20%) deaths due to cause 2 . Of the 134 censored patients, 
27 were censored with longer follow-up time than the largest death time. Figure [T] shows 
the Kaplan-Meier estimate of the survival function of time to any terminal event. This 
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Kaplan-Meier estimate of overall survival 



Figure 1: The estimated Kaplan-Meier curve of time to all-causes failure. 


survival curve appears to reach a plateau 10 years post-surgery, indicating the presence 
of a sub-population which survives event-free by the end of the follow-up and clearly 
suggesting the appropriateness of a competing risks mixture cure model. For these data 
clinical interest is to understand how covariates affect the survival distribution, after 
surgery for melanoma, in the presence of competing risks. 

First a vertical model ignoring cure (VM) was £t to the data. This model serves 
two purposes: (1) to estimate the effect of covariates on time to death due to each of 
melanoma cancer and other causes and (2) to estimate the probabilities of each type of 
events since surgery. The model aggregates the covariate effects on latency and incidence 


in an intricate way (see, for instance, equation (15)). A Cox proportional hazards model 
was used for the total hazard including all covariates: 


A,(t|Z) = Ao(t)exp ( 7 ^Z), 


where Ao(-) is an unspecihed baseline hazard and 7 stands for a vector of unknown 
regression parameters, and a logistic regression model was used for the relative hazards 
including all covariates and piecewise constant time functions, with cut-off points at the 
0.25, 0.5, 0.75 quartiles of the failure time distribution leading to: 

(20) logit(7ri(t|T = t, Z)) = /s:7B(f) -|- vj Z, 


where B(t) = (l{t e (0,1.76]}, l{t G (1.76, 2.90]}, l{t e (2.90,4.67]}, l{t G (4.67,15.23]}) 
and rji = {ki,vi) stands for a vector of unknown regression parameters. The estimated 
regression parameters are reported in Table 

For illustration, we present the results of the model £t for a male patient with ulcer¬ 
ation and for average values of the continuous covariates. Table gives the estimated 
relative hazards implied by htting model (20) with associated standard errors. It can be 
seen that the dominating cause of death post surgery is melanoma cancer. 

Figure shows (in gray lines) the estimated cumulative incidence functions of time 
to type 1-event and time to type 2-event. 
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Table 4: Regression parameters in VM. 


Covariate 

Regression parameters (SE) 
in Cox model for survival 

Regression parameters (SE) 
in logistic model for relative hazard 

Thickness 

0.09 (0.04) 

0.01 (0.12) 

Ulcer 

0.98 (0.26) 

1.46 (0.86) 

Age 

0.02 (0.08) 

-0.04 (0.24) 

Year (stand) 

-0.88 (0.55) 

-0.95 (1.70) 

Sex 

0.42 (0.24) 

0.30 (0.72) 

l{t G (0,1.76]} 


1.93 (1.87) 

l{t G (1.76,2.90]} 


4.24 (2.08) 

l{t G (2.90,4.67]} 


3.84 (1.80) 

l{t G (4.67,15.23]} 


2.61 (1.67) 


Table 5: Estimated piecewise constant relative hazards of cause 1 and their standard error 
for a male patient with ulceration and for average values of the continuous covariates 
derived by htting VM. 



{0.1.76] 

(1.76,2.90] 

(2.90,4.67] 

(4.67,15.23] 

Cause 1 

0.754 (0.141) 

0.968 (0.038) 

0.954 (0.044) 

0.858 (0.106) 


Population cumulative incidences 



Figure 2: The estimated cumulative incidences of time to melanoma and time to other 
causes based on VM (in gray) and on VMCF (in black) for a male individual with 
ulceration and for the mean values of the continuous covariates, together with the 95% 
conhdence intervals derived by htting the former model. 
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Table 6: Regression parameters in VMCF. 


Regression parameters (SE) 


Covariate 

Incidence model 

Latency model 
Conditional Cox Logistic 

Intercept 

-2.62 (0.88) 


Thickness 

0.07 (0.11) 

0.10 (0.06) 0.01 (0.12) 

Ulcer 

0.95 (0.64) 

0.87 (0.50) 1.46 (0.86) 

Age 

0.03 (0.15) 

-0.0006 (0.11) -0.04 (0.24) 

Year(stand) 

0.51 (0.89) 

-1.49 (1.03) -0.95 (1.70) 

Sex 

0.57 (0.47) 

0.42 (0.50) 0.30 (0.72) 

l{t G (0,1.76]} 


1.93 (1.87) 

l{t e (1.76,2.90]} 


4.24 (2.08) 

l{t e (2.90,4.67]} 


3.84 (1.80) 

l{t e (4.67,15.23]} 


2.61 (1.67) 


We then applied the vertical modeling with a cured fraction (VMCF) to these data. 
This model serves three purposes: (1) to estimate the effect of covariates on time to death 
due to each of melanoma cancer and other causes in the population of uncured patients, 
(2) to estimate the cause-specific cumulative incidences in the population of uncured 
patients and (3) to estimate the effect of covariates on the probability of being cured. 
A logistic regression model was postulated for the incidence part and for the latency 
part, a Cox proportional hazards model for the conditional (on Y = 1) total hazard and 
a logistic model for the relative hazards where piece-wise constant time functions were 
used with cut-off points at the quartiles of the failure time distribution function. Note 
that this model for the relative hazards coincides with the one used in VM where the 
evidence of cure was ignored, because the conditional (on T = t) distribution of causes of 
failure depends only on the actual causes of failure observed. All baseline covariates were 
included in both parts (incidence and latency) of the model. No selection of covariates was 
performed to test whether some covariates could be removed. The estimated regression 
parameters are reported in Table 

The estimators of the survival curve S'pop(t|Z) and of the conditional (on Y = 1) sur¬ 
vival curve S{t\Y = 1, Z), for a male individual with ulceration and for average values of 
the continuous covariates, are plotted in Figure [^in dashed and dotted lines, respectively, 
and compared with the estimator of the survival curve S{t\Z) derived from VM (solid 
line). 

The estimated conditional (on Y = 1) cumulative incidence function for a susceptible 
male patient with ulceration and for the mean values of the continuous covariates are 
plotted in Figure]^ The corresponding predicted probability of cure is 1 — p = 0.25. We 
also plotted in Figure (in black) the estimated (unconditional) cumulative incidence 
functions of time to melanoma and time to other causes derived from VMCF, for a male 
individual with ulceration and average values of his other covariates. As expected in 
Figure]^ the estimates for the population S'pop(t|Z) based on our VMCF model is similar 
to that based on VM. 

Finally, it is worth comparing Table with Table it clearly shows that, especially 
for age, interpretation in terms of hazard ratios differs between S{t\Y = 1, Z) and S{t\Z). 
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Estimated survival curve 



Figure 3: The estimated survival curve of time to any event at the mean values of the 
continuous covariates for a male individual with ulceration, based on VM (solid line) 
and on VMCF for the combined population (dashed line) and for the susceptible sub¬ 
population (dotted line), together with the 95% conhdence intervals derived by htting 
the former model. 


Male with ulceration 



Figure 4: Model-based estimates of the conditional (on Y = 1) cumulative incidences of 
time to melanoma and time to other causes for susceptible male patient with ulceration 
and for the mean values of the continuous covariates. 
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5 Discussion 


Little has been published about cure models in combination with competing risks. Several 
reviews have been recently published to highlight the advantages of cure models over the 
limitations of standard methods like Kaplan-Meier, Cox models or parametric models 


for survival data when statistical cure is a reasonable assumption (Jia et ah (2013); Yu 


et ah (2013)). In this paper, we have illustrated a strategy for the statistical analysis of 


competing risks data with a cure fraction. In this way, we have argued that, when there 
is clinical evidence of a cured proportion in a cohort, special attention should be given to 
the heterogeneity present among individuals. 

An important advantage of the competing risks mixture cure model (VMCF) over 
the standard competing risks model (VM) is two-fold: (1) it allows inference of the 
susceptible sub-population, and therefore a better understanding and interpretation of 
the variability of the data and (2) it allows estimation and direct modeling of the cure 
indicator. Summary measures like these can be a useful tool to complement the existing 
statistical measures. Each covariate in VMCF can contribute with up to three sets of 
regression parameters, one parameter reflecting how the covariate affects the chance of 
cure, one parameter for the risk of failure, irrespective of the cause of failure, and one set 
of parameters for the relative position of each failure type among all failure types. 

An appealing technical feature of VMCF resides in its parametrisation, which is il¬ 
lustrated in equation (18). VMCF naturally separates the observed likelihood into two 
factors: one where the cure indicator distribution is irrelevant and one where the cure 
indicator distribution is relevant. The former factor is pertinent to causes of failure when 
a failure occurs; in the absence of competing events, the observed likelihood (18) re¬ 
duces to the observed likelihood of the Cox proportional hazards mixture cure model of 
Sy and Taylor (2000). The latter factor uses information on the failure and censoring 
times and, therefore, it is sensitive to the joint distribution of (T, Y). This feature makes 
our method straighforward to implement by means of smcure package available in 0 


Development Core Team (2010). 


We believe that our approach can play a useful role, as it can accommodate complex 
competing risks data. It is worth mentioning that the approach can be naturally extended 


to deal with missing causes of failure in competing risks. We refer to the work of Nicolaie 


et al. (2011) for a description of how this can be achieved. 


An important issue is that semi-parametric mixture cure models are by construction 
non-identihable. We cannot precisely tell apart individuals who are cured or not among 
those who are censored. However, the presence of individuals with long follow-up and 
event-free works as empirical evidence of the existence of a cured subgroup. We adopt the 
strategy of Sy and Taylor (2000), who approach the non-identihability problem through 
the use of the zero-tail constraint on the baseline failure time distribution. Another 
strategy has been adopted by Peng (2003), who impose a parametric shape on the tail of 
the failure time distribution. 

Another aspect goes to the core of how cure is perceived in the presence of competing 
risks, other than the risk that is related to the disease. It might be the case that a diseased 
patient is at risk of several mutually exclusive causes of failure and their potential cure 
from the disease cannot be observed if they die from accidental causes. For instance, a 
different formulation is proposed by Basu and Tiwari (2010), where separate competing 


15 































risks structures are considered for the cure and the susceptible latent groups and a joint 
prior distribution is assumed on the collection of parameters. Cure is dehned as not 
having experienced death due to the disease-related cause; therefore, the status of cure 
is observed only for individuals subjected to death not due to the cause of interest. We 
are currently adapting the vertical modeling approach to this situation. 
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Appendix A: Derivation of formula (17) 


Using ( [l6| ), the hazard ratio corresponding to two individuals with covariate values given 
by Z and Z is given by 

-^•,pop(^|Z) 

On the other hand, using (|I5|, the same hazard ratio is given by 


A..pop(t|z,y ^ _ E[y|T>t,z,x] 

A.,pop(t|Z,X) "" E[U|T>UZ,X]’ 

Equating the two representations leads to 


r7T(t)(Z-Z) = 7T(Z-Z) + log 


E[Y\T > t,Z,X] 
E[Y\T > t,Z,X] 


whose fc-th component is given by (0. 


Appendix B: The EM-algorithm 

In the E-step of the algorithm the conditional expectation of logL 3 (/ 3 , 7 , Ao(t|U = 1); Y) 
is computed with respect to the distribution of the unobserved Uj’s, given the current 
parameters values and the observed data O = As Uj’s contribute as linear 

terms in log L 3 (/ 3 , 7 , Ao(t|U = 1 );U), it is enough to compute, at a given iteration m, 
the weight = E{Yi\0^'^\X^^\t\Y = 1),0). For an individual i who experiences an 
event at time f* (irrespective of its cause) the corresponding weight is 

= E{Y\6^^\xlr\u\Y = l),T = U,D,e{l,...,J}) 

= PiY, = X^r\u\Y = 1),T = u, A e {1,..., J}) 

= 1 , 


w} 


while for an individual i who is censored at time tj the corresponding weight is 

= E{Y\e^^\xt\u\Y = l),T>U,D, = Q) 

= P{Y, = i\e^'^\xt\u\Y = 1),T > U, A = 0) 


(7-i(/3^Xn ■ So{U\Y = l)exp(7Tzp + 1 _ ^-i(/3Tx*) 




Denote the expected complete log-likelihood by Ep[logL3(/3, 7 , Ao(U|U = 1); 
where 

In the M-step of the algorithm, Ep[logL 3 (/ 3 , 7 , Ao(t|U = 1); is maximized 

with respect to (/3, 7 , Ao(t|U = 1), given Unlike in the standard Cox proportional 

hazards model, where the baseline hazard is seen as a nuisance parameter and eliminated 
in the procedure of estimating 7, one cannot eliminate Ao(t|U = 1) in the Cox propor¬ 
tional hazards model embedded in the VMCF without loosing information about (3. The 
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main reason is that the hazard rate at the population level is no longer proportional 


(see (15) and (16)) and arguments similar to those leading to the Cox partial likelihood 
do not hold true anymore. Peng and Dear (2000), Sy and Taylor (2000) proposed a 


partial likelihood type method to estimate 7 without specifying the nuisance parameter 
Ao(t|D = 1). This involves the updating of the Aalen-Nelson estimator of Ao(f|D = 1) at 
the mth iteration as given by 


A„((|K = 1)= ^ 


di 




E 


l£Ri 


(m) 


exp{j^Zi) 


where di is the number of events at time ti, irrespective of the cause, and Ri is the risk set 
at ti- By substituting Ao(f|D = 1) into log L 32 ( 7 , Ao(t|D = we get the weighted 

partial likelihood of 7 , that is 


n[ 


exp(7^Z^) 

EisR, w^f™Exp(7TZz)- 




To assure identihability, we impose the zero-tail constraint (see 
that is, S'o(t|D = 1 ) = 0 for f 


Sy and Taylor (2000)), 


Appendix C: The standard errors of conditional (on 
Y = 1) cumulative hazards from vertical modeling with 
a cure fraction 


In this appendix we derive the formula for the standard error of the conditional (on Y = 1) 
cause-specihc cumulative hazard from vertical modeling with a cure fraction, when we 
omit the vector Z of covariates from the model of the latency component. As a conse¬ 
quence, the vector of parameter describing VMCF is (/3, 77 , X,(ti\Y = 1),..., \,(tK\Y = 
!))• 


The Nelson-Aalen estimator of A,(f|y = 1), denoted by A,(f|y = 1), makes jumps 
of size dA,(t\Y = 1) at event time points 0 = to < ti < t 2 <...< tx < 00 . An 
approximation of the covariance matrix of {f3,dA,{ti\Y = 1),... ,dA,{tK\Y = 1)) is 
given by the inverse of the observed full information matrix ( 1 ^= 1 ) • 

Remark. The matrix rfA.{-|v=i) diagonal. Let denote an inverse 

^pdA.( |v=i) “(iA.(-|y=i) sub-matrix of corresponding to 


dA,{-\Y = 1). 

Relevant quantities for our purposes are the relative hazards Tijit)] we model them as 
in formula ( 12 ). 


Remark. Often it will convenient to retain the system (12) and to work with the r x J 
Fisher information matrix of = (r^i,..., r/j) , denoted by which has rank r( J — 1) 
and, in particular, is not invertible. Let denote a Moore-Penrose generalized inverse 
of 

We are interested to develop a formula for the JK x JK covariance matrix var(A(- |y = 
1)) =: Ha of the estimator Aj{t\Y = 1) = <t^ 3 {'ts)X,{ts\Y = 1) = <*^^* 7 = 1 , 
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where Xjg Y=i = ^j{ts)^»{ts\Y = !)• First, we introduce some notation, as follows: 

T = {ri,\.{t,\Y = l),...,\.{tK\Y = l)y , 

A = ((A,(ti|F = {Aj{t2\Y = ..., {Aj{tK\Y = 


and 


A — (Aii,y=i, A21,y=l, . . . , Aji,y=i, Ai2,y=l, • • • , Aj2,y=l, Aii^,y=i, ..., Ajx,y=i) • 

According to the Delta-method, we get 


(21) Ha = 


aA(-|F = l) 9A(-|F = 1) 


( 9Ai-\Y = l) dXi-\Y = l) Y 
^ ’ UA(-|y = i) Ot ) 


aA(-|F = l) dr ^ ' V9A(-|y = l) 

It is straightforward to see that the matrix of order JK is given by 


/ Ijxj 0 0 

IjxJ Ijxj 0 


( 22 ) 


dA 


0 \ 
0 


\ IjxJ IjxJ 

where Ijxj is the identity matrix of order J. Also, we have 

d\js,Y=l 


0 

IjxJ 


drji^ 


= -Tlj{tg)Tli{ts)\,{ts\Y = l)Bu{ts) , 


where j, / e {1,..., J}, j 7 ^ /, s e {1,..., A}, m e {1,..., r}, and 

d\jsY=l 


di] 


= TTjitg) [1 - TTjits)] X,{ts\Y = l)Bu{ts) , 


'ju 


where j G {1,..., J}, s G u G 

Moreover, 

d\js^Y=l _ / N . 

d\.{U\Y = l) -^A^s)0s,v , 

where j G {1, . .., J}, s, u G {1,. .., A} and 6 stands for the Kronecker delta. 
We shall define, for t > 0, the J x J matrix fl{t) as follows: 


fl{t) = 


[ T^lit) 0 
0 Tl2{t) 


0 \ 
0 


- (7ri(t), 7r2(t),..., 7rj(t)) (7ri(t), 7r2(t),..., Tijit)) , 

, .. I 

\ 0 0 

and the r-vector 

exit) = (A.(t|F = l)Ai(t), A.(t|F = l)A 2 (t),..., A.(t|F = l)A,(t))^. 
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Setting 


n{ts) = {7ll{t,),7l2{ts), . . .,7lj{ts)y, 

a column vector of length J, for s G {1,..., we get 

9(Ai5_y=l, A2s,y=l, ■ ■ ■, Ajs,y=i) 


(23) 


5(hnh2,---,hj) 
where 0 stands for the Kronecker product, and hnally 


= Q,{ts) 0 {oL{ts)) , s G {1 ,..., it'} , 


<-) 


/ f 2 (ti) ® {a{ti)y n(ti) 0 0 

f 2 (t 2 ) ® (Q:(t 2 ))"^ 0 n(t 2 ) 0 

0 


0 \ 
0 


0 


\ d{tK) ^ {OL{tK))^ 0 0 0 ... 0 Il{tK) / 

which is a matrix of order JK x (Jr + K). As a result, we obtain 

I 0 


(25) 


d\ 


0 


'd\\^ 


"dA.(-|y=i) 


In conclusion, using (21), (22) and (25), we have that 


(26) 


where 


and 


'—'A 


/ WiH^Wi + Hi WiH^W 2 + Hi ... WiH^W;^ + ^ 

W2H^Wi + ni W2H^W2 + n2 ... W2H^Wx + n2 

V w^H^Wi + n, w^h^W 2 + n2 ... + Uk J 

k 

Wk = J 2 ® {o^{ts)V, ke{l,...,K} , 

S=1 

k k 

hlfc; = '^dA.My=il ^ ^ ^ n(t5)(II(t;)) ), i, / G {1, . . . , iJ} . 


1=1 


5=1 
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